125 research outputs found

    Simple approximate MAP Inference for Dirichlet processes

    The Dirichlet process mixture (DPM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference are plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMs. This algorithm is as simple as K-means clustering and performs in experiments as well as Gibbs sampling, while requiring only a fraction of the computational effort. Unlike related small variance asymptotics, our algorithm is non-degenerate and so inherits the "rich get richer" property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables standard tools such as cross-validation to be used. This is a well-posed approximation to the MAP solution of the probabilistic DPM model. Comment: 11 pages, 4 figures, 5 tables.

    Simple approximate MAP inference for Dirichlet processes mixtures

    The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference are plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering and solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches, both from a computational complexity perspective and in terms of clustering performance. We demonstrate the wide applicability of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can be applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.
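
    For orientation, the DP-means baseline that the abstract compares against can be sketched in a few lines. This is not the authors' MAP-DP algorithm (their own code is at the URL above); it is a minimal illustration of DP-means as it is commonly described, assuming squared Euclidean distance and a penalty threshold. The names `dp_means`, `sq_dist`, `centroid_of` and `lam` are hypothetical and chosen only for this sketch.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid_of(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def dp_means(points, lam, max_iter=100):
    """Illustrative DP-means: k-means that opens a new cluster whenever a
    point's squared distance to every centroid exceeds the threshold lam."""
    centroids = [centroid_of(points)]  # start with one global cluster
    labels = [0] * len(points)
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            dists = [sq_dist(p, c) for c in centroids]
            k = min(range(len(dists)), key=dists.__getitem__)
            if dists[k] > lam:
                # Too far from every cluster: open a new one at this point.
                centroids.append(list(p))
                k = len(centroids) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Recompute centroids, dropping clusters that lost all members.
        members = {}
        for lab, p in zip(labels, points):
            members.setdefault(lab, []).append(p)
        kept = sorted(members)
        remap = {old: new for new, old in enumerate(kept)}
        centroids = [centroid_of(members[old]) for old in kept]
        labels = [remap[lab] for lab in labels]
        if not changed:
            break
    return labels, centroids


# Toy data (made up for this sketch): two well-separated groups.
data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
        [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
labels, centroids = dp_means(data, lam=4.0)
```

    Because the threshold acts on distance alone, DP-means ignores cluster sizes, which is the degeneracy the abstract says MAP-DP avoids by retaining the "rich get richer" property.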

    Predicting room occupancy with a single passive infrared (PIR) sensor through behavior extraction

    Passive infrared sensors have widespread use in many applications, including motion detectors for alarms, lighting systems and hand dryers. Combinations of multiple PIR sensors have also been used to count the number of humans passing through doorways. In this paper, we demonstrate the potential of the PIR sensor as a tool for occupancy estimation inside a monitored environment. Our approach shows how flexible nonparametric machine learning algorithms extract useful information about the occupancy from a single PIR sensor. The approach allows us to understand and make use of the motion patterns generated by people within the monitored environment. The proposed counting system uses information about those patterns to provide an accurate estimate of room occupancy which can be updated every 30 seconds. The system was successfully tested on data from more than 50 real office meetings consisting of at most 14 room occupants.

    Optimal design for correlated processes with input-dependent noise

    Optimal design for parameter estimation in Gaussian process regression models with input-dependent noise is examined. The motivation stems from the area of computer experiments, where computationally demanding simulators are approximated using Gaussian process emulators to act as statistical surrogates. In the case of stochastic simulators, which produce a random output for a given set of model inputs, repeated evaluations are useful, supporting the use of replicate observations in the experimental design. The findings are also applicable to the wider context of experimental design for Gaussian process regression and kriging. Designs are proposed with the aim of minimising the variance of the Gaussian process parameter estimates. A heteroscedastic Gaussian process model is presented which allows for an experimental design technique based on an extension of Fisher information to heteroscedastic models. It is empirically shown that the error of the approximation of the parameter variance by the inverse of the Fisher information is reduced as the number of replicated points is increased. Through a series of simulation experiments on both synthetic data and a systems biology stochastic simulator, optimal designs with replicate observations are shown to outperform space-filling designs both with and without replicate observations. Guidance is provided on best practice for optimal experimental design for stochastic response models.

    OscoNet: Inferring oscillatory gene networks

    Background: Oscillatory genes, with periodic expression at the mRNA and/or protein level, have been shown to play a pivotal role in many biological contexts. However, with the exception of the circadian clock and cell cycle, only a few such genes are known. Detecting oscillatory genes from snapshot single-cell experiments is a challenging task due to the lack of time information. Oscope is a recently proposed method to identify co-oscillatory gene pairs using single-cell RNA-seq data. Although promising, the current implementation of Oscope does not provide a principled statistical criterion for selecting oscillatory genes. Results: We improve the optimisation scheme underlying Oscope and provide a well-calibrated non-parametric hypothesis test to select oscillatory genes at a given FDR threshold. We evaluate performance on synthetic data and three real datasets and show that our approach is more sensitive than the original Oscope formulation, discovering larger sets of known oscillators while avoiding the need for less interpretable thresholds. We also describe how our proposed pseudo-time estimation method is more accurate in recovering the true cell order for each gene cluster while requiring substantially less computation time than the extended nearest insertion approach. Conclusions: OscoNet is a robust and versatile approach to detect oscillatory gene networks from snapshot single-cell data, addressing many of the limitations of the original Oscope method.

    Does anticholinergics drug burden relate to global neuro-disability outcome measures and length of hospital stay?

    Primary objective: To assess the relationship between disability, length of stay (LOS) and anticholinergic burden (ACB) in people following acquired brain or spinal cord injury. Research design: A retrospective case note review assessed total rehabilitation unit admission. Methods and procedures: Assessment of 52 consecutive patients with acquired brain/spinal injury and neuropathy in an in-patient neuro-rehabilitation unit of a UK university hospital. Data analysed included: Northwick Park Dependency Score (NPDS), Rehabilitation Complexity Scale (RCS), Functional Independence Measure and Functional Assessment Measure (FIM-FAM, UK version 2.2), LOS and ACB. The outcome was the difference in RCS, NPDS and FIM-FAM between admission and discharge. Main outcomes and results: A positive change in ACB resulted in a positive change in NPDS, with no significant effect on FIM-FAM, either Motor or Cognitive, or on the RCS. Change in ACB correlated with the length of hospital stay (regression correlation = −6.64; SE = 3.89). An increase in ACB score during hospital stay, from low to high ACB, had a significant harmful impact on NPDS (OR = 9.65; 95% CI = 1.36–68.64) and FIM-FAM Total scores (OR = 0.03; 95% CI = 0.002–0.35). Conclusions: There was a statistically significant correlation between ACB and both neuro-disability measures and LOS amongst this patient cohort.

    Design of Experiments for Screening

    The aim of this paper is to review methods of designing screening experiments, ranging from designs originally developed for physical experiments to those especially tailored to experiments on numerical models. The strengths and weaknesses of the various designs for screening variables in numerical models are discussed. First, classes of factorial designs for experiments to estimate main effects and interactions through a linear statistical model are described, specifically regular and nonregular fractional factorial designs, supersaturated designs and systematic fractional replicate designs. Generic issues of aliasing, bias and cancellation of factorial effects are discussed. Second, group screening experiments are considered, including factorial group screening and sequential bifurcation. Third, random sampling plans are discussed, including Latin hypercube sampling and sampling plans to estimate elementary effects. Fourth, a variety of modelling methods commonly employed with screening designs are briefly described. Finally, a novel study demonstrates six screening methods on two frequently used exemplars, and their performances are compared.
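
    Of the sampling plans listed, Latin hypercube sampling is simple enough to sketch directly. The following is a minimal illustration on the unit hypercube, not code from the paper: each input dimension is split into equal-width strata, one value is drawn per stratum, and the strata are shuffled independently per dimension. The function name `latin_hypercube` and its parameters are hypothetical.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in [0, 1)^n_dims such that, in every
    dimension, each of the n_samples equal-width strata holds one point."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each stratum [i/n, (i+1)/n).
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(col)
        columns.append(col)
    # Transpose the per-dimension columns into a list of sample points.
    return [list(row) for row in zip(*columns)]


# A 5-run design in 3 inputs; rescale each column to the real input ranges.
design = latin_hypercube(5, 3, seed=0)
```

    Each run of a numerical model would then be evaluated at one row of `design`, which guarantees that every input is explored across its whole range even with very few runs.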

    Thirty Years with EoS/G^E Models - What Have We Learned?
